In part 1 of this demo series, I described OpenStreetMap’s data structure. In this one, let’s dive into some code.

Download OpenStreetMap data into R using osmdata

The package I use to get OpenStreetMap data is osmdata, by Mark Padgham and Robin Lovelace. The main purpose of this package is to download vector data from OSM, for example the roadways and bike infrastructure described above.

First, let’s get a feel for the osmdata package by downloading Atlanta’s roads. Then we’ll focus on a smaller area to explore bike infrastructure.

Use osmdata to download Atlanta’s roads

Install and load osmdata. I’ll also use the tidyverse, sf, mapview, and ggmap.

library(osmdata)
library(tidyverse)
library(sf)
library(mapview)
library(ggmap)

The three workhorse functions of this package are opq(), add_osm_feature(), and osmdata_sf().

  • opq(), short for overpass query, uses the overpass API to query OpenStreetMap data. The great thing about this package is one need not know about the overpass API to make use of the data. The important argument it takes is a bounding box (bbox=), which can be simply the name of a place like a city. Or, alternatively, four lat/lon values can be used. This function can be used on its own to simply request all of the data within a bounding box. It’s often more useful to request specific features, which is done using:
  • add_osm_feature(). This function takes a key and returns objects with values for that key. Recall, a highway (i.e., a roadway or path) is a type of key.
  • osm_data_sf() passes the queries specified by the bounding box and the requested features to the overpass API and returns an sf object.

These three functions can be piped (%>%) together. For example, suppose we’d like vector data corresponding to the primary roadways in Atlanta.

atl_primary = opq(bbox = "Atlanta, Georgia, USA") %>%
  add_osm_feature(key = "highway", value = "primary") %>%
  osmdata_sf() %>%  
  .$osm_lines %>% #Extract the osm_lines, specifically.
  st_as_sf() #Confirm it's sf.

atl_primary %>% mapview()

For context, let’s add secondary roads, tertiary roads, trunk roads, and the interstate highways. We’ll leave out residential roads, as the data become too big. The code is repetitive, but I’ve had better luck creating an sf object for each highway type rather than calling all types in one query using sequential add_osm_feature() function calls.

atl_motorway = opq(bbox = "Atlanta, Georgia, USA") %>%
  add_osm_feature(key = "highway", value = "motorway") %>%
  osmdata_sf() %>%  
  .$osm_lines %>%  
  st_as_sf()  

atl_trunk = opq(bbox = "Atlanta, Georgia, USA") %>%
  add_osm_feature(key = "highway", value = "trunk") %>%
  osmdata_sf() %>%  
  .$osm_lines %>%  
  st_as_sf()  

atl_secondary = opq(bbox = "Atlanta, Georgia, USA") %>%
  add_osm_feature(key = "highway", value = "secondary") %>%
  osmdata_sf() %>%  
  .$osm_lines %>%  
  st_as_sf()  

atl_tertiary = opq(bbox = "Atlanta, Georgia, USA") %>%
  add_osm_feature(key = "highway", value = "tertiary") %>%
  osmdata_sf() %>%  
  .$osm_lines %>%  
  st_as_sf()  

Each of those objects are sf objects, so they can be manipulated using dplyr verbs.

class(atl_primary)
## [1] "sf"         "data.frame"

Let’s bind the sf data together using bind_rows().

atl_roads =  atl_primary %>%
  bind_rows(
    atl_motorway,
    atl_trunk,
    atl_secondary,
    atl_tertiary
  )

#Take a look at the number of features of each highway type.
table(atl_roads$highway)
## 
##  motorway   primary secondary  tertiary     trunk 
##       742       865      1843      1379        48

Tangent to visualize a smaller mapview

To visualize these all together, I’m going to first pare down to a 1-mile radius around Five Points, as I was having issues rendering the larger mapview to html. In your own console, this simple call to mapview(atl_roads) should work for a quick visual.

atl_roads %>% 
  mapview(zcol = "highway")

I’ll use mutate_geocode() from ggmap to geocode the address and will create a 1-mile buffer around it using st_buffer() from sf.

five_points_1mi_rad = as_tibble("Five Points, GA")  %>% 
  as_tibble() %>%
  mutate_geocode(value, force = TRUE) %>% 
  st_as_sf(coords = c("lon", "lat"),crs = 4326) %>% #the output at this step is a point.
  #change the coordinate system to one in feet.
  st_transform("+proj=tmerc +lat_0=30 +lon_0=-84.16666666666667 +k=0.9999 +x_0=699999.9999999999
               +y_0=0 +datum=NAD83 +units=us-ft +no_defs +ellps=GRS80 +towgs84=0,0,0") %>%
  st_buffer(5280) %>% 
  st_transform(4326) #transform back to 4326 so it jives with the default coordinate system output of the OpenStreetMap data above.
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=Five+Points,+GA&key=xxx-julIPmYuW18b675rSafE

Find the intersection of the OSM vector data with this buffer radius using st_intersection(), and visualize using mapview().

library(RColorBrewer) #to get a better palette.
set1_5 = RColorBrewer::brewer.pal(n=5, name  = "Set1") 
atl_roads %>%
  st_intersection(five_points_1mi_rad) %>%
  dplyr::select(osm_id, name, HFCS, bicycle, highway) %>%    #pick a few variables
  mapview(
    zcol = "highway",
    color =set1_5,
    map.types = c("CartoDB.DarkMatter", "CartoDB.Positron", "OpenStreetMap")
  )
## although coordinates are longitude/latitude, st_intersection assumes that they are planar

Use osmdata to gather bicycle infrastructure around the intersection of Monroe and Ponce.

Now let’s construct a dataset of bicycle infrastructure. Then, we’ll focus on a smaller area and see if we can build a dataset of bicycle infrastructure. Focus on a smaller area and try to accurately classify infrastructure

Our goal here is to demonstrate how to bring in some of OpenStreetMap data into R.

https://cran.r-project.org/web/packages/osmdata/vignettes/osmdata.html

Cycleway may either be its own way or cycleway may be the value of h There is some good discussion on this topic here () and in the article by Ferster & colleagues.

berne st is difficult, because there is a bike lane westbound but a sharrow eastbound.

let’s pick a small area, an area that includes ponce (buffered), the beltline (a trail), and 10th (a protected lane)